Temporal Trends Analysis

Japan-Vietnam Research Collaborations (1973-2024)

Author

Your Name

Published

November 4, 2025

1 Executive Summary

This report analyzes temporal trends in 9982 journal articles resulting from Japan-Vietnam research collaborations (1973-2024). Key findings include:

  • Publications grew exponentially, particularly after 2010
  • Multilateral collaborations surpassed bilateral in 2019
  • Open Access adoption crossed 50% in 2017
  • Physical Sciences dominated early; Life Sciences grew rapidly post-2010

2 1. Publication Growth Patterns

2.1 1.1 Overall Output Trajectory

# Calculate annual counts
annual_output <- data_reg |> 
  filter(year >= cutoff_year) |> 
  count(year, name = "publications")

# Calculate growth statistics
total_pubs <- nrow(data_reg)
recent_5yr <- data_reg |> filter(year >= 2020) |> nrow()
pct_recent <- round(100 * recent_5yr / total_pubs, 1)

# Plot
p1 <- ggplot(annual_output, aes(x = year, y = publications)) +
  geom_line(color = "#2E86AB", linewidth = 1.2) +
  geom_point(color = "#2E86AB", size = 2) +
  geom_smooth(method = "loess", se = TRUE, color = "#A23B72", 
              linetype = "dashed", linewidth = 0.8) +
  scale_y_continuous(labels = comma) +
  labs(
    title = "Japan-Vietnam Research Output (1990-2024)",
    subtitle = glue::glue("{pct_recent}% of all publications occurred in the last 5 years"),
    x = "Year",
    y = "Number of Publications",
    caption = "Source: Scopus | Loess smoothing curve shown"
  )

p1

Annual publication output showing exponential growth post-2010
Key Insight

Publications increased 861-fold from 1990 to 2024, with acceleration visible after 2010.
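As a rough cross-check, the fold increase can be converted into an implied average annual growth rate. The sketch below assumes only the figures quoted above (861-fold over the 34 annual steps from 1990 to 2024); the variable names are illustrative and do not come from the report's code.

```r
# Implied compound annual growth rate from an overall fold change.
fold_change <- 861          # 2024 output relative to 1990, as stated above
n_years     <- 2024 - 1990  # 34 annual steps

implied_cagr <- fold_change^(1 / n_years) - 1
round(100 * implied_cagr, 1)  # about 22% per year
```

An endpoint-based rate like this is sensitive to the very small 1990 baseline, so it is best read alongside period averages rather than on its own.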

2.2 1.2 Growth Rates by Period

# Define periods
period_breaks <- c(1990, 2000, 2010, 2015, 2020, 2025)
period_labels <- c("1990-1999", "2000-2009", "2010-2014", 
                   "2015-2019", "2020-2024")

# Calculate CAGR for each period
growth_rates <- data_reg |> 
  filter(year >= cutoff_year) |> 
  mutate(period = cut(year, breaks = period_breaks, 
                      labels = period_labels, right = FALSE)) |> 
  filter(!is.na(period)) |> 
  group_by(period) |> 
  summarise(
    n_pubs = n(),
    years = n_distinct(year),
    .groups = "drop"
  ) |> 
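  mutate(
    # `years` counts only years with at least one publication
    avg_annual = n_pubs / years,
    # Caveat: this compares period totals with a fixed 5-year exponent, but the
    # first two periods span 10 years, so the 2000-2009 and 2010-2014 values
    # are not comparable with those of the later 5-year periods
    cagr = ((n_pubs / lag(n_pubs))^(1/5) - 1) * 100
  )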

# Plot
p2 <- ggplot(growth_rates, aes(x = period, y = avg_annual)) +
  geom_col(fill = "#2E86AB", alpha = 0.8) +
  geom_text(aes(label = round(avg_annual, 0)), 
            vjust = -0.5, size = 4, fontface = "bold") +
  labs(
    title = "Average Annual Publications by Period",
    x = "Period",
    y = "Average Publications per Year"
  )

p2

Average annual publications across different time periods
growth_rates |> 
  select(Period = period, 
         `Total Pubs` = n_pubs,
         `Avg/Year` = avg_annual,
         `CAGR (%)` = cagr) |> 
  mutate(`Avg/Year` = round(`Avg/Year`, 1),
         `CAGR (%)` = round(`CAGR (%)`, 1)) |> 
  kable(align = "lrrr") |> 
  kable_styling(bootstrap_options = c("striped", "hover"))
Publication growth statistics by period
Period      Total Pubs   Avg/Year   CAGR (%)
1990-1999          148       16.4         NA
2000-2009         1009      100.9       46.8
2010-2014         1227      245.4        4.0
2015-2019         2728      545.6       17.3
2020-2024         4865      973.0       12.3
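
The CAGR column is computed from period totals with a fixed five-year exponent, while the first two periods are ten years long. A hedged alternative, sketched below using only the totals in the table, compares average output per calendar year and annualizes over the gap between period midpoints. The `start`, `end`, `mid`, `avg_per_year` and `cagr_adj` columns are introduced here for illustration and are not part of the report's pipeline.

```r
# Period totals copied from the table above.
periods <- data.frame(
  period = c("1990-1999", "2000-2009", "2010-2014", "2015-2019", "2020-2024"),
  start  = c(1990, 2000, 2010, 2015, 2020),
  end    = c(1999, 2009, 2014, 2019, 2024),
  n_pubs = c(148, 1009, 1227, 2728, 4865)
)

# Average output per calendar year and the midpoint of each period.
periods$avg_per_year <- periods$n_pubs / (periods$end - periods$start + 1)
periods$mid          <- (periods$start + periods$end) / 2

# Annualized growth between consecutive period averages, using the
# midpoint-to-midpoint gap (10, 7.5, 5, 5 years) as the exponent.
prev_avg <- c(NA, head(periods$avg_per_year, -1))
gap      <- c(NA, diff(periods$mid))
periods$cagr_adj <- ((periods$avg_per_year / prev_avg)^(1 / gap) - 1) * 100

print(transform(periods[, c("period", "avg_per_year", "cagr_adj")],
                cagr_adj = round(cagr_adj, 1)))
```

Under these assumptions the 2010-2014 rate comes out near 12.6% instead of 4.0%, while the last two rows stay at 17.3% and 12.3% because those periods are equally long. The sketch divides by calendar years, so 1990-1999 averages 14.8 per year rather than the table's 16.4, which divides by the number of years that have at least one publication.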

2.3 1.3 Exponential vs Linear Growth Modeling

# Prepare data for modeling
model_data <- annual_output |> 
  mutate(
    year_index = year - min(year),
    log_pubs = log(publications + 1)
  )

# Fit models
linear_model <- lm(publications ~ year_index, data = model_data)
exp_model <- lm(log_pubs ~ year_index, data = model_data)

# Extract R-squared
r2_linear <- glance(linear_model)$r.squared
r2_exp <- glance(exp_model)$r.squared

# Generate predictions
model_data <- model_data |> 
  mutate(
    pred_linear = predict(linear_model),
    pred_exp = exp(predict(exp_model)) - 1
  )

# Plot
p3 <- ggplot(model_data, aes(x = year)) +
  geom_point(aes(y = publications), size = 2, alpha = 0.6) +
  geom_line(aes(y = pred_linear, color = "Linear"), 
            linewidth = 1) +
  geom_line(aes(y = pred_exp, color = "Exponential"), 
            linewidth = 1) +
  scale_color_manual(
    values = c("Linear" = "#E63946", "Exponential" = "#06A77D"),
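    # Name the labels so each one is matched to its own break; unnamed labels
    # are assigned in alphabetical break order (Exponential, Linear)
    labels = c(
      "Linear" = paste0("Linear (R² = ", round(r2_linear, 3), ")"),
      "Exponential" = paste0("Exponential (R² = ", round(r2_exp, 3), ")")
    )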
  ) +
  labs(
    title = "Growth Model Comparison",
    subtitle = "Exponential model provides better fit to recent data",
    x = "Year",
    y = "Publications",
    color = "Model"
  )

p3

Comparison of linear and exponential growth models
Model Selection

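The exponential model (R² = 0.949) fits better than the linear model (R² = 0.779), suggesting accelerating growth in collaboration intensity. The exponential R² is measured on the log scale (log_pubs) and the linear R² on raw counts, so the comparison is indicative rather than exact.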
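
One way to put both models on the same footing is to back-transform the log-scale predictions and score them against the raw counts. The sketch below uses synthetic annual counts (not the report's data) and a hypothetical helper, `r2_counts()`, purely to illustrate the calculation.

```r
# Synthetic annual counts for illustration only (not the report's data).
set.seed(42)
year_index   <- 0:34
publications <- round(5 * exp(0.15 * year_index + rnorm(35, sd = 0.2)))

linear_model <- lm(publications ~ year_index)
exp_model    <- lm(log(publications + 1) ~ year_index)

# R-squared evaluated against the raw counts for any set of predictions.
r2_counts <- function(observed, predicted) {
  1 - sum((observed - predicted)^2) / sum((observed - mean(observed))^2)
}

r2_linear     <- r2_counts(publications, fitted(linear_model))
r2_exp_counts <- r2_counts(publications, exp(fitted(exp_model)) - 1)
r2_exp_log    <- summary(exp_model)$r.squared  # log-scale value

round(c(linear = r2_linear, exp_counts = r2_exp_counts, exp_log = r2_exp_log), 3)
```

For the linear model, `r2_counts()` reproduces the usual R². For the exponential model, the count-scale value can differ noticeably from the log-scale one, which is why reporting both is a reasonable precaution.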

2.4 1.4 Growth Phase Identification

# Calculate year-over-year growth rate
yoy_growth <- annual_output |> 
  arrange(year) |> 
  mutate(
    pct_change = (publications - lag(publications)) / lag(publications) * 100,
    growth_rate_ma = zoo::rollmean(pct_change, k = 3, fill = NA, align = "center")
  )

# Identify phases (simplified)
phases <- tibble(
  phase = c("Emergence", "Acceleration", "Rapid Growth"),
  start = c(1990, 2005, 2015),
  end = c(2004, 2014, 2024),
  color = c("#264653", "#2A9D8F", "#E76F51")
)

# Plot
p4 <- ggplot(yoy_growth, aes(x = year, y = pct_change)) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "gray50") +
  geom_col(fill = "#2E86AB", alpha = 0.6) +
  geom_line(aes(y = growth_rate_ma), color = "#A23B72", 
            linewidth = 1.2, na.rm = TRUE) +
  geom_vline(xintercept = c(2005, 2015), 
             linetype = "dashed", color = "red", alpha = 0.5) +
  annotate("text", x = 1997, y = max(yoy_growth$pct_change, na.rm = TRUE), 
           label = "Emergence", size = 4, color = "#264653") +
  annotate("text", x = 2010, y = max(yoy_growth$pct_change, na.rm = TRUE), 
           label = "Acceleration", size = 4, color = "#2A9D8F") +
  annotate("text", x = 2020, y = max(yoy_growth$pct_change, na.rm = TRUE), 
           label = "Rapid Growth", size = 4, color = "#E76F51") +
  labs(
    title = "Year-over-Year Growth Rate (%)",
    subtitle = "3-year moving average shown in purple | Phase transitions marked",
    x = "Year",
    y = "Growth Rate (%)"
  )

p4

Distinct growth phases with inflection points marked
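
Year-over-year rates computed with `lag()` assume one row per calendar year, but `count()` returns rows only for years with at least one publication. The 1990-1999 row of the period table (148 publications at 16.4 per year) implies nine such years, so at least one early year appears to be absent. The sketch below, on made-up counts, shows one way to fill such gaps with zeros before computing the rate; the `full` and `prev` names are illustrative.

```r
# Made-up annual counts with 1992 missing (no publications that year).
annual <- data.frame(
  year         = c(1990, 1991, 1993, 1994),
  publications = c(2, 3, 6, 9)
)

# Build the complete year sequence and fill absent years with zero.
full <- merge(
  data.frame(year = seq(min(annual$year), max(annual$year))),
  annual,
  all.x = TRUE
)
full$publications[is.na(full$publications)] <- 0

# Year-over-year change; undefined (NA) when the previous year is zero or absent.
prev <- c(NA, head(full$publications, -1))
full$pct_change <- ifelse(prev > 0, (full$publications - prev) / prev * 100, NA)
full
```

Without the fill step, the 1993 row would be compared against 1991 and reported as a 100% single-year increase.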

3 2. Collaboration Type Evolution

3.1 2.1 Bilateral vs Multilateral Proportions

# Calculate proportions
coop_trends <- data_reg |> 
  filter(year >= cutoff_year, !is.na(coop)) |> 
  group_by(year, coop) |> 
  summarise(n = n(), .groups = "drop") |> 
  group_by(year) |> 
  mutate(
    total = sum(n),
    pct = n / total * 100
  )

# Find crossover point
crossover <- coop_trends |> 
  select(year, coop, pct) |> 
  pivot_wider(names_from = coop, values_from = pct) |> 
  filter(multilateral > bilateral) |> 
  ungroup() |>  # data are still grouped by year; ungroup so slice_min() returns one row
  slice_min(year, n = 1) |> 
  pull(year)

# Plot
p5 <- ggplot(coop_trends, aes(x = year, y = pct, color = coop, group = coop)) +
  geom_line(linewidth = 1.2) +
  geom_point(size = 2) +
  geom_vline(xintercept = crossover, linetype = "dashed", 
             color = "red", alpha = 0.5) +
  annotate("text", x = crossover, y = 75, 
           label = paste("Crossover:", crossover), 
           angle = 90, vjust = -0.5, size = 4) +
  scale_color_manual(values = colors_coop) +
  scale_y_continuous(labels = function(x) paste0(x, "%")) +
  labs(
    title = "Evolution of Collaboration Types",
    subtitle = glue::glue("Multilateral surpassed bilateral in {crossover}"),
    x = "Year",
    y = "Percentage of Publications",
    color = "Cooperation Type"
  )

p5

Shift from bilateral to multilateral collaborations over time
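
Taking the first year in which the multilateral share exceeds the bilateral share can pick up an isolated early year with few publications. A more conservative definition, sketched below on hypothetical shares, is the first year from which multilateral stays ahead in every later year. The `shares` data frame and the `first_any` and `first_sustained` names are assumptions for illustration.

```r
# Hypothetical yearly shares (%), with an isolated early crossover in 2014.
shares <- data.frame(
  year         = 2014:2021,
  multilateral = c(52, 45, 47, 49, 48, 51, 53, 55)
)
shares$bilateral <- 100 - shares$multilateral

above <- shares$multilateral > shares$bilateral

# First year with any crossover.
first_any <- min(shares$year[above])

# First year from which the crossover holds in all subsequent years:
# a reversed cumulative product stays 1 only while every later year is TRUE.
sustained       <- rev(cumprod(rev(above))) == 1
first_sustained <- min(shares$year[sustained])

c(first_any = first_any, first_sustained = first_sustained)
```

In this toy series the two definitions give 2014 and 2019 respectively, which shows how much the reported crossover year can depend on the rule used.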

3.2 2.2 Network Expansion (Number of Countries)

# Calculate average countries per year
country_trends <- data_reg |> 
  filter(year >= cutoff_year, !is.na(n_countries)) |> 
  group_by(year) |> 
  summarise(
    mean_countries = mean(n_countries),
    median_countries = median(n_countries),
    max_countries = max(n_countries),
    .groups = "drop"
  )

# Plot
p6 <- ggplot(country_trends, aes(x = year)) +
  geom_line(aes(y = mean_countries, color = "Mean"), linewidth = 1.2) +
  geom_line(aes(y = median_countries, color = "Median"), linewidth = 1.2) +
  geom_ribbon(aes(ymin = median_countries, ymax = max_countries), 
              alpha = 0.2, fill = "#2E86AB") +
  scale_color_manual(
    values = c("Mean" = "#2E86AB", "Median" = "#A23B72")
  ) +
  labs(
    title = "Network Expansion Over Time",
    subtitle = "Average and median number of countries per publication",
    x = "Year",
    y = "Number of Countries",
    color = "Metric"
  )

p6

Average number of countries involved per publication over time
# Create heatmap of country counts
country_dist <- data_reg |> 
  filter(year >= cutoff_year, !is.na(n_countries)) |> 
  mutate(
    period = cut(year, 
                 breaks = c(1990, 2000, 2010, 2020, 2025),
                 labels = c("1990-1999", "2000-2009", 
                           "2010-2019", "2020-2024"),
                 right = FALSE),
    country_group = case_when(
      n_countries == 2 ~ "2 countries",
      n_countries == 3 ~ "3 countries",
      n_countries == 4 ~ "4 countries",
      n_countries >= 5 ~ "5+ countries",
      TRUE ~ "Other"
    )
  ) |> 
  filter(!is.na(period)) |> 
  count(period, country_group) |> 
  group_by(period) |> 
  mutate(pct = n / sum(n) * 100)

# Plot
p7 <- ggplot(country_dist, aes(x = period, y = country_group, fill = pct)) +
  geom_tile(color = "white") +
  geom_text(aes(label = paste0(round(pct, 1), "%")), 
            color = "white", fontface = "bold") +
  scale_fill_gradient(low = "#E8F4F8", high = "#2E86AB") +
  labs(
    title = "Distribution of Collaboration Network Size",
    x = "Period",
    y = "Number of Countries",
    fill = "Percentage"
  ) +
  theme(legend.position = "right")

p7

Distribution of collaboration sizes over time

3.3 2.3 Predictors of Multilateralization

# Prepare data
multil_data <- data_reg |> 
  filter(year >= cutoff_year, !is.na(coop)) |> 
  mutate(
    is_multilateral = as.integer(coop == "multilateral"),
    year_scaled = scale(year)[,1],
    oa_binary = as.integer(OA == "OA"),
    funded_binary = as.integer(fund == "Funded")
  )

# Fit logistic regression
multil_model <- glm(
  is_multilateral ~ year_scaled + oa_binary + funded_binary + 
    LS + SS + PS + HS,
  data = multil_data,
  family = binomial()
)

# Tidy results with odds ratios
multil_results <- tidy(multil_model) |> 
  mutate(
    odds_ratio = exp(estimate),
    conf_low = exp(estimate - 1.96 * std.error),
    conf_high = exp(estimate + 1.96 * std.error),
    significance = case_when(
      p.value < 0.001 ~ "***",
      p.value < 0.01 ~ "**",
      p.value < 0.05 ~ "*",
      TRUE ~ ""
    )
  ) |> 
  select(Term = term, `Odds Ratio` = odds_ratio, 
         `95% CI Low` = conf_low, `95% CI High` = conf_high,
         `p-value` = p.value, ` ` = significance)

multil_results |> 
  mutate(across(where(is.numeric), ~round(., 3))) |> 
  kable(align = "lrrrrl") |> 
  kable_styling(bootstrap_options = c("striped", "hover"))
Logistic regression: Predictors of multilateral (vs bilateral) collaboration
Term            Odds Ratio   95% CI Low   95% CI High   p-value
(Intercept)          0.821        0.724         0.931     0.002   **
year_scaled          1.284        1.229         1.343     0.000   ***
oa_binary            1.089        1.003         1.183     0.043   *
funded_binary        1.351        1.242         1.469     0.000   ***
LS                   0.775        0.704         0.853     0.000   ***
SS                   0.791        0.683         0.915     0.002   **
PS                   0.768        0.691         0.853     0.000   ***
HS                   1.426        1.274         1.597     0.000   ***
Finding

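Publication year and funding are strongly associated with multilateral collaboration (p < 0.001), while OA status shows a weaker positive association (OR = 1.089, p = 0.043). Among domains, Health Sciences raises the odds (OR = 1.426), whereas Life, Social, and Physical Sciences lower them.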
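
To make the odds ratios easier to read, they can be expressed as percentage changes in the odds of a paper being multilateral. The sketch below copies the odds ratios from the regression table; the baseline odds of 1 used in the second step is a hypothetical value chosen only to show how an odds ratio translates into a probability.

```r
# Odds ratios copied from the regression table above.
odds_ratio <- c(year_scaled = 1.284, oa_binary = 1.089, funded_binary = 1.351,
                LS = 0.775, SS = 0.791, PS = 0.768, HS = 1.426)

# Percentage change in the odds of a multilateral paper per one-unit increase.
pct_change_odds <- round((odds_ratio - 1) * 100, 1)
pct_change_odds

# Illustration only: with hypothetical baseline odds of 1 (a 50% probability),
# the funding odds ratio would shift the probability as follows.
baseline_odds <- 1
funded_odds   <- baseline_odds * odds_ratio[["funded_binary"]]
funded_prob   <- funded_odds / (1 + funded_odds)
round(funded_prob, 3)
```

Because `year_scaled` is standardized, its odds ratio refers to a one standard deviation increase in publication year, not to a single calendar year.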


4 3. Open Access Adoption

4.1 3.1 OA Penetration Rates

# Calculate OA proportions
oa_trends <- data_reg |> 
  filter(year >= cutoff_year) |> 
  group_by(year, OA) |> 
  summarise(n = n(), .groups = "drop") |> 
  group_by(year) |> 
  mutate(
    total = sum(n),
    pct = n / total * 100
  )

# Find 50% crossover
oa_crossover <- oa_trends |> 
  filter(OA == "OA") |> 
  filter(pct >= 50) |> 
  ungroup() |>  # oa_trends is grouped by year; ungroup so slice_min() returns the first year only
  slice_min(year, n = 1) |> 
  pull(year)

# Plot
p8 <- ggplot(oa_trends, aes(x = year, y = pct, fill = OA)) +
  geom_area(alpha = 0.7) +
  geom_vline(xintercept = oa_crossover, linetype = "dashed", 
             color = "black", alpha = 0.5) +
  annotate("text", x = oa_crossover, y = 75, 
           label = paste("50% OA:", oa_crossover), 
           angle = 90, vjust = -0.5, size = 4) +
  scale_fill_manual(values = colors_oa) +
  scale_y_continuous(labels = function(x) paste0(x, "%")) +
  labs(
    title = "Open Access Adoption Over Time",
    subtitle = glue::glue("OA surpassed 50% in {oa_crossover}"),
    x = "Year",
    y = "Percentage of Publications",
    fill = "Access Type"
  )

p8

Open Access adoption trajectory showing rapid growth post-2010

4.2 3.2 OA Adoption Speed by Discipline

# Calculate OA by discipline
oa_discipline <- data_reg |> 
  filter(year >= cutoff_year) |> 
  select(year, OA, LS, SS, PS, HS) |> 
  pivot_longer(cols = LS:HS, names_to = "domain", values_to = "value") |> 
  filter(value == 1) |> 
  mutate(is_oa = as.integer(OA == "OA")) |> 
  group_by(year, domain) |> 
  summarise(
    oa_rate = mean(is_oa, na.rm = TRUE) * 100,
    .groups = "drop"
  )

# Plot
p9 <- ggplot(oa_discipline, aes(x = year, y = oa_rate, color = domain)) +
  geom_line(linewidth = 1.2) +
  geom_point(size = 2) +
  geom_hline(yintercept = 50, linetype = "dashed", alpha = 0.3) +
  scale_color_manual(
    values = colors_domain,
    labels = c("HS" = "Health Sciences", "LS" = "Life Sciences",
               "PS" = "Physical Sciences", "SS" = "Social Sciences")
  ) +
  scale_y_continuous(labels = function(x) paste0(x, "%")) +
  labs(
    title = "Open Access Adoption by Domain",
    x = "Year",
    y = "OA Rate (%)",
    color = "Domain"
  )

p9

Open Access adoption rates vary significantly by discipline

4.3 3.3 OA vs Non-OA Growth Trajectories

# Count by year and OA
oa_counts <- data_reg |> 
  filter(year >= cutoff_year) |> 
  count(year, OA)

# Plot
p10 <- ggplot(oa_counts, aes(x = year, y = n, color = OA)) +
  geom_line(linewidth = 1.2) +
  geom_point(size = 2) +
  scale_color_manual(values = colors_oa) +
  scale_y_continuous(labels = comma) +
  labs(
    title = "Open Access vs Non-OA Publication Counts",
    subtitle = "OA publications now dominate the collaboration output",
    x = "Year",
    y = "Number of Publications",
    color = "Access Type"
  )

p10

Absolute counts showing OA overtaking non-OA

5 4. Disciplinary Shifts

5.1 4.1 Domain Dominance Over Time

# Calculate domain counts
domain_trends <- data_reg |> 
  filter(year >= cutoff_year) |> 
  select(year, LS, SS, PS, HS) |> 
  pivot_longer(cols = LS:HS, names_to = "domain", values_to = "value") |> 
  filter(value == 1) |> 
  count(year, domain)

# Plot
p11 <- ggplot(domain_trends, aes(x = year, y = n, color = domain)) +
  geom_line(linewidth = 1.2) +
  geom_point(size = 2) +
  scale_color_manual(
    values = colors_domain,
    labels = c("HS" = "Health Sciences", "LS" = "Life Sciences",
               "PS" = "Physical Sciences", "SS" = "Social Sciences")
  ) +
  scale_y_continuous(labels = comma) +
  labs(
    title = "Publication Output by Research Domain",
    subtitle = "Physical Sciences lead, but Life Sciences growing rapidly",
    x = "Year",
    y = "Number of Publications",
    color = "Domain"
  )

p11

Evolution of research domains showing PS dominance with LS catching up

5.2 4.2 Relative Domain Share

# Calculate proportions
domain_share <- data_reg |> 
  filter(year >= cutoff_year) |> 
  select(year, LS, SS, PS, HS) |> 
  pivot_longer(cols = LS:HS, names_to = "domain", values_to = "value") |> 
  filter(value == 1) |> 
  count(year, domain) |> 
  group_by(year) |> 
  mutate(pct = n / sum(n) * 100)

# Plot
p12 <- ggplot(domain_share, aes(x = year, y = pct, fill = domain)) +
  geom_area(alpha = 0.7, position = "stack") +
  scale_fill_manual(
    values = colors_domain,
    labels = c("HS" = "Health Sciences", "LS" = "Life Sciences",
               "PS" = "Physical Sciences", "SS" = "Social Sciences")
  ) +
  scale_y_continuous(labels = function(x) paste0(x, "%")) +
  labs(
    title = "Domain Share Over Time (Stacked %)",
    x = "Year",
    y = "Percentage of Publications",
    fill = "Domain"
  )

p12

Proportional representation showing PS declining share despite growth

5.4 4.4 Subject-Area Diversification

# Calculate subject diversity per year
subject_cols <- c("agr_bio", "art_hum", "bio_chem", "buss", "chem_eng", 
                  "chem", "comp_sci", "des_sci", "earth", "econ", "ener",
                  "egin", "env_sci", "immu", "mat_sci", "math", "med",
                  "neuro", "nurse", "pharm", "phys", "psy", "soc_sci",
                  "vet", "den", "heal")

# Only proceed if these columns exist
if (all(subject_cols %in% names(data_reg))) {
  diversity_trends <- data_reg |> 
    filter(year >= cutoff_year) |> 
    select(year, all_of(subject_cols)) |> 
    pivot_longer(cols = -year, names_to = "subject", values_to = "value") |> 
    filter(value == 1) |> 
    count(year, subject) |> 
    group_by(year) |> 
    mutate(
      prop = n / sum(n),
      diversity = -sum(prop * log(prop))
    ) |> 
    distinct(year, diversity)
  
  # Plot
  p14 <- ggplot(diversity_trends, aes(x = year, y = diversity)) +
    geom_line(color = "#2E86AB", linewidth = 1.2) +
    geom_smooth(method = "loess", se = TRUE, color = "#A23B72", 
                linetype = "dashed") +
    labs(
      title = "Subject-Area Diversification (Shannon Index)",
      subtitle = "Higher values indicate more diverse subject coverage",
      x = "Year",
      y = "Shannon Diversity Index"
    )
  
  p14
}

Shannon diversity index showing field diversification over time
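
The index plotted here is H = -sum(p * log(p)), where p is each subject's share of the year's subject assignments. The sketch below applies the same formula to two hypothetical count vectors to show how the value responds to concentration; the `shannon()` helper and the counts are illustrative.

```r
# Shannon diversity index H = -sum(p * log(p)) over subject shares.
shannon <- function(counts) {
  counts <- counts[counts > 0]  # zero counts contribute nothing
  p <- counts / sum(counts)
  -sum(p * log(p))
}

concentrated <- c(med = 90, phys = 5, chem = 5)    # hypothetical counts
balanced     <- c(med = 34, phys = 33, chem = 33)  # hypothetical counts

shannon(concentrated)
shannon(balanced)
log(3)  # upper bound for three equally represented subjects
```

Since the maximum is log(k) for k subjects present, the index can rise simply because more subject areas appear in later years. Dividing by log(k) gives an evenness measure that separates the two effects.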

6 5. Quality Metrics Evolution

6.1 5.1 Quartile Distribution Over Time

# Calculate quartile proportions
quartile_trends <- data_reg |> 
  filter(year >= cutoff_year, !is.na(quartile)) |> 
  count(year, quartile) |> 
  group_by(year) |> 
  mutate(pct = n / sum(n) * 100)

# Plot
p15 <- ggplot(quartile_trends, aes(x = year, y = pct, fill = quartile)) +
  geom_area(alpha = 0.7) +
  scale_fill_manual(
    values = c("Q1" = "#06A77D", "Q2" = "#F4A261", 
               "Q3" = "#E76F51", "Q4" = "#E63946")
  ) +
  scale_y_continuous(labels = function(x) paste0(x, "%")) +
  labs(
    title = "Journal Quality Distribution Over Time",
    subtitle = "Q1 publications increasing as share of total",
    x = "Year",
    y = "Percentage",
    fill = "Quartile"
  )

p15

Journal quality distribution showing improvement toward Q1

6.3 5.3 Citation Patterns by Publication Cohort

# Calculate citation statistics by cohort
citation_cohorts <- data_reg |> 
  filter(year >= 2000, year <= 2020) |>  # Focus on cohorts with time to accumulate
  mutate(
    cohort = cut(year, 
                 breaks = seq(2000, 2020, by = 5),
                 labels = c("2000-2004", "2005-2009", 
                           "2010-2014", "2015-2019"),
                 right = FALSE)
  ) |> 
  filter(!is.na(cohort)) |> 
  group_by(cohort) |> 
  summarise(
    mean_cited = mean(cited, na.rm = TRUE),
    median_cited = median(cited, na.rm = TRUE),
    q75_cited = quantile(cited, 0.75, na.rm = TRUE),
    n = n(),
    .groups = "drop"
  )

# Plot
p17 <- ggplot(citation_cohorts, aes(x = cohort)) +
  geom_col(aes(y = mean_cited), fill = "#2E86AB", alpha = 0.6) +
  geom_point(aes(y = median_cited, color = "Median"), size = 4) +
  geom_point(aes(y = q75_cited, color = "75th percentile"), size = 4) +
  geom_text(aes(y = mean_cited, label = paste0("n=", n)), 
            vjust = -0.5, size = 3) +
  scale_color_manual(values = c("Median" = "#A23B72", 
                                "75th percentile" = "#E76F51")) +
  labs(
    title = "Citation Impact by Publication Cohort",
    subtitle = "Bars show mean citations; points show median and 75th percentile",
    x = "Publication Cohort",
    y = "Number of Citations",
    color = "Metric"
  )

p17

Citation accumulation by publication year cohort

6.4 5.4 Impact Maturation Curves

# Calculate citations per paper-age
impact_maturation <- data_reg |> 
  filter(year >= 2000, year <= 2020) |> 
  mutate(age = 2024 - year) |> 
  filter(age >= 4) |>  # At least 4 years old
  group_by(age) |> 
  summarise(
    mean_cited = mean(cited, na.rm = TRUE),
    median_cited = median(cited, na.rm = TRUE),
    n = n(),
    .groups = "drop"
  )

# Plot
p18 <- ggplot(impact_maturation, aes(x = age)) +
  geom_line(aes(y = mean_cited, color = "Mean"), linewidth = 1.2) +
  geom_line(aes(y = median_cited, color = "Median"), linewidth = 1.2) +
  geom_smooth(aes(y = mean_cited), method = "loess", 
              se = TRUE, linetype = "dashed", alpha = 0.2) +
  scale_color_manual(values = c("Mean" = "#2E86AB", "Median" = "#A23B72")) +
  labs(
    title = "Citation Accumulation by Paper Age",
    subtitle = "Average citations as papers mature",
    x = "Years Since Publication",
    y = "Citations",
    color = "Metric"
  )

p18

Citation accumulation curves showing time-to-impact patterns
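
Because raw citation counts grow with paper age, cohorts are easier to compare on an age-normalized basis. A minimal sketch, assuming the same 2024 reference year as the age calculation above and using hypothetical records, divides each paper's citations by its years of exposure. The `papers` data frame, `exposure` and `cites_per_year` are illustrative names.

```r
# Hypothetical records; `year` and `cited` mirror the column names used above.
papers <- data.frame(
  year  = c(2005, 2005, 2012, 2012, 2019, 2019),
  cited = c(40, 10, 21, 9, 6, 3)
)

census_year <- 2024  # same reference year as the age calculation above

# Years of exposure, counting the publication year itself.
papers$exposure       <- census_year - papers$year + 1
papers$cites_per_year <- papers$cited / papers$exposure

# Median age-normalized citation rate per publication year.
rate_by_year <- aggregate(cites_per_year ~ year, data = papers, FUN = median)
rate_by_year
```

This is still a cross-sectional view: each paper is observed once, at its current age, so the result describes differences between cohorts rather than the trajectory of any single paper.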

7 Summary Statistics

7.1 Key Findings Table

7.2 Period Comparison Matrix

# Define periods for comparison
period_comparison <- data_reg |> 
  filter(year >= cutoff_year) |> 
  mutate(
    period = cut(year,
                 breaks = c(1990, 2000, 2010, 2020, 2025),
                 labels = c("1990-1999", "2000-2009", 
                           "2010-2019", "2020-2024"),
                 right = FALSE)
  ) |> 
  filter(!is.na(period)) |> 
  group_by(period) |> 
  summarise(
    `Total Pubs` = n(),
    `% OA` = round(100 * mean(OA == "OA", na.rm = TRUE), 1),
    `% Multilateral` = round(100 * mean(coop == "multilateral", na.rm = TRUE), 1),
    `% Q1` = round(100 * mean(quartile == "Q1", na.rm = TRUE), 1),
    `Avg SJR` = round(mean(sjr_score, na.rm = TRUE), 3),
    `Median Citations` = median(cited, na.rm = TRUE),
    .groups = "drop"
  )

period_comparison |> 
  kable(align = "lrrrrrr") |> 
  kable_styling(bootstrap_options = c("striped", "hover")) |> 
  row_spec(nrow(period_comparison), bold = TRUE, background = "#E8F4F8")
Key metrics comparison across time periods
period      Total Pubs   % OA   % Multilateral   % Q1   Avg SJR   Median Citations
1990-1999          148   30.4             27.0   44.6     0.863                 15
2000-2009         1009   33.9             31.4   44.9     0.894                 24
2010-2019         3955   40.3             44.0   53.4     1.238                 17
2020-2024         4865   51.7             51.6   61.0     1.331                  7

8 Conclusions

8.2 Implications for Policy

  1. Sustained Growth Trajectory: The exponential growth pattern suggests Japan-Vietnam collaboration has reached critical mass. Output rose in every period, although the period growth rate eased from 17.3% (2015-2019) to 12.3% (2020-2024).

  2. Internationalization Success: The shift to multilateral collaborations indicates successful integration into broader research networks beyond bilateral ties.

  3. Open Science Leadership: Rapid OA adoption (51.7% of 2020-2024 output) may position this partnership as a leader in open science practices in Asia, though this report does not benchmark it against other partnerships.

  4. Quality Over Quantity: Simultaneous increases in output and quality (Q1 share up from 44.6% to 61.0% and average SJR from 0.863 to 1.331 between 1990-1999 and 2020-2024) demonstrate maturing research capacity.

  5. Disciplinary Balance: While Physical Sciences dominate, growth in Life Sciences and Health Sciences reflects diversification and alignment with SDGs.


9 Appendix: Data Quality Notes

# Calculate missingness
missing_summary <- data_reg |> 
  filter(year >= cutoff_year) |> 
  summarise(
    across(
      c(year, cited, OA, coop, quartile, sjr_score, n_countries),
      ~sum(is.na(.)) / n() * 100
    )
  ) |> 
  pivot_longer(everything(), 
               names_to = "Variable", 
               values_to = "Missing %") |> 
  arrange(`Missing %`)

missing_summary |> 
  mutate(`Missing %` = round(`Missing %`, 2)) |> 
  kable(align = "lr") |> 
  kable_styling(bootstrap_options = c("striped", "hover"))
Missingness and data completeness by key variables
Variable      Missing %
year               0.00
cited              0.00
OA                 0.00
n_countries        0.00
coop               0.41
quartile           6.70
sjr_score          6.70
Data Limitations
  • SJR data missing for ~7% of publications (journal title mismatches)
  • Cooperation type undefined for papers with only one country
  • Citation counts are time-dependent (older papers have had more time to accumulate citations)
  • Funding data relies on manual categorization

10 Session Information

sessionInfo()
R version 4.5.1 (2025-06-13)
Platform: aarch64-apple-darwin20
Running under: macOS Tahoe 26.0.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1

locale:
[1] C.UTF-8/C.UTF-8/C.UTF-8/C/C.UTF-8/C.UTF-8

time zone: Asia/Ho_Chi_Minh
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] kableExtra_1.4.0 knitr_1.50       broom_1.0.9      patchwork_1.3.2 
 [5] scales_1.4.0     lubridate_1.9.4  forcats_1.0.0    stringr_1.5.2   
 [9] dplyr_1.1.4      purrr_1.1.0      readr_2.1.5      tidyr_1.3.1     
[13] tibble_3.3.0     ggplot2_4.0.0    tidyverse_2.0.0 

loaded via a namespace (and not attached):
 [1] gtable_0.3.6       jsonlite_2.0.0     compiler_4.5.1     tidyselect_1.2.1  
 [5] xml2_1.4.0         textshaping_1.0.3  systemfonts_1.3.1  yaml_2.3.10       
 [9] fastmap_1.2.0      R6_2.6.1           generics_0.1.4     backports_1.5.0   
[13] htmlwidgets_1.6.4  svglite_2.2.2      pillar_1.11.0      RColorBrewer_1.1-3
[17] tzdb_0.5.0         rlang_1.1.6        stringi_1.8.7      xfun_0.53         
[21] S7_0.2.0           viridisLite_0.4.2  timechange_0.3.0   cli_3.6.5         
[25] withr_3.0.2        magrittr_2.0.3     digest_0.6.37      grid_4.5.1        
[29] rstudioapi_0.17.1  hms_1.1.3          lifecycle_1.0.4    vctrs_0.6.5       
[33] evaluate_1.0.5     glue_1.8.0         farver_2.1.2       codetools_0.2-20  
[37] rmarkdown_2.29     tools_4.5.1        pkgconfig_2.0.3    htmltools_0.5.8.1